
Dynamic Multi-Arm Bandit Game Based Multi-Agents Spectrum Sharing Strategy Design



Abstract

For a wireless avionics communication system, a multi-arm bandit game is mathematically formulated, which includes channel states, strategies, and rewards. The simple case, in which only two agents share the spectrum, is fully studied in terms of maximizing the cumulative reward over a finite time horizon. An Upper Confidence Bound (UCB) algorithm is used to achieve the optimal solutions for the stochastic Multi-Arm Bandit (MAB) problem. The MAB problem can also be solved from the Markov game framework perspective. Meanwhile, Thompson Sampling (TS) is used as a benchmark to evaluate the performance of the proposed approach. Numerical results are also provided regarding minimizing the expected regret and choosing the best parameter for the upper confidence bound.
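
To make the comparison in the abstract concrete, the following is a minimal sketch of UCB and Thompson Sampling on a stochastic bandit. It is not the paper's implementation. It assumes a single agent, Bernoulli rewards (a channel is either usable or not), and hypothetical channel success probabilities `probs`; the two-agent game and the Markov game formulation are not modeled. The exploration weight `c` is assumed to be the kind of "parameter for the upper confidence bound" that the abstract refers to, and all function names are illustrative.

```python
import math
import random


def ucb_select(counts, means, t, c):
    """Return the arm with the largest UCB index; untried arms go first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: means[a] + c * math.sqrt(math.log(t) / counts[a]),
    )


def thompson_select(successes, failures, rng):
    """Return the arm with the largest sample from its Beta posterior."""
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(probs, horizon, policy="ucb", c=math.sqrt(2), seed=0):
    """Simulate one agent on Bernoulli channels (an assumed reward model).

    Returns the pull count per arm and the cumulative expected regret,
    i.e. the sum over rounds of (best mean - mean of the chosen arm).
    """
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k   # number of pulls per arm
    wins = [0] * k     # number of successful transmissions per arm
    regret = 0.0
    best = max(probs)
    for t in range(1, horizon + 1):
        if policy == "ucb":
            means = [wins[a] / counts[a] if counts[a] else 0.0 for a in range(k)]
            arm = ucb_select(counts, means, t, c)
        else:
            losses = [counts[a] - wins[a] for a in range(k)]
            arm = thompson_select(wins, losses, rng)
        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        wins[arm] += reward
        regret += best - probs[arm]
    return counts, regret


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.8]  # hypothetical channel success probabilities
    for policy in ("ucb", "thompson"):
        counts, regret = run_bandit(probs, horizon=2000, policy=policy)
        print(policy, counts, round(regret, 1))
```

Rerunning `run_bandit` with different values of `c` is one simple way to explore how the width of the confidence bound trades exploration against regret under these assumptions.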
